Abstract

Genome-wide association studies (GWAS) have identified chromosomal loci that affect risk of coronary heart disease (CHD)
independent of classical risk factors. One such association signal has been identified at 6q23.2 in both Caucasians and East
Asians. The lead CHD-associated polymorphism in this region, rs12190287, resides in the 3' untranslated region (3'-UTR) of
TCF21, a basic-helix-loop-helix transcription factor, and is predicted to alter the seed binding sequence for miR-224. Allelic
imbalance studies in circulating leukocytes and human coronary artery smooth muscle cells (HCASMC) showed significant
imbalance of the TCF21 transcript that correlated with genotype at rs12190287, consistent with this variant contributing to
allele-specific expression differences. 3'-UTR reporter gene transfection studies in HCASMC showed that the disease-associated
C allele has reduced expression compared to the protective G allele. Kinetic analyses in vitro revealed faster RNA-RNA
complex formation and greater binding of miR-224 with the TCF21 C allelic transcript. In addition, in vitro probing with
Pb²⁺ and RNase T1 revealed structural differences between the TCF21 variants in proximity of the rs12190287 variant, which
are predicted to provide greater access to the C allele for miR-224 binding. miR-224 and TCF21 expression levels were
anti-correlated in HCASMC, and miR-224 modulates the transcriptional response of TCF21 to transforming growth factor-β
(TGF-β) and platelet-derived growth factor (PDGF) signaling in an allele-specific manner. Lastly, miR-224 and TCF21 were localized
in human coronary artery lesions and anti-correlated during atherosclerosis. Together, these data suggest that miR-224
interaction with the TCF21 transcript contributes to allelic imbalance of this gene, thus partly explaining the genetic risk for
coronary heart disease associated at 6q23.2. These studies implicating rs12190287 in the miRNA-dependent regulation of
TCF21, in conjunction with previous studies showing that this variant modulates transcriptional regulation through activator
protein 1 (AP-1), suggest a unique bimodal level of complexity previously unreported for disease-associated variants.

Citation: Miller CL, Haas U, Diaz R, Leeper NJ, Kundu RK, et al. (2014) Coronary Heart Disease-Associated Variation in TCF21 Disrupts a miR-224 Binding Site and
miRNA-Mediated Regulation. PLoS Genet 10(3): e1004263. doi:10.1371/journal.pgen.1004263

Editor: Mark I. McCarthy, University of Oxford, United Kingdom 

Received November 25, 2013; Accepted February 11, 2014; Published March 27, 2014 

Copyright: © 2014 Miller et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits 
unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. 

Funding: This work was supported by NIH grants R01HL103635 (TQ), R01HL109512 (TQ), NIH training grant T32 HL094274 (CLM), and grants from the Leducq
Foundation (TQ, JE, HS). Further grants were received from the German Federal Ministry of Education and Research (BMBF) in the context of the e:Med program
(e:AtheroSysMed) and the FP7 European Union project CVgenes@target (261123) (JE, HS). The study was supported by the local focus programs "Kardiovaskuläre
Genomforschung" and the "Forschungsförderung project E24-2011" of the Universität zu Lübeck. The funders had no role in study design, data collection and
analysis, decision to publish, or preparation of the manuscript.

Competing Interests: The authors have declared that no competing interests exist. 

* E-mail: clintm@stanford.edu (CLM); tomq1@stanford.edu (TQ); sczakiel@imm.uni-luebeck.de (GS) 

These authors contributed equally to this work.



Introduction 

Coronary heart disease (CHD), involving atherosclerosis and 
myocardial infarction (MI), is a genetically complex trait and 
represents the leading cause of mortality worldwide. Meta-analyses 
of genome-wide association studies (GWAS) for CHD have 
identified 46 replicated loci in subjects of European descent [1]. 
Of these loci, the region at 6q23.2 contains the lead variant,
rs12190287, which had the lowest P value among several SNPs that
reached the genome-wide significance threshold in this locus [2].
rs12190287 is located within an exon of the basic-helix-loop-helix
transcription factor TCF21, and represents an expression quantitative
trait locus (eQTL) for this gene by regulating TCF21 gene
expression levels in omental adipose and liver tissues [2].
Importantly, the TCF21 locus association with CHD was recently
confirmed in a meta-analysis of predominantly European subjects
genotyped with the Cardio-Metabochip (Illumina) and in a three-stage
GWAS for CHD in individuals of Han Chinese descent [1,3].

The association of TCF21 with CHD is particularly compelling, 
given its association with fundamental cardiovascular embryonic 
Author Summary

Both genetic and environmental factors cumulatively
contribute to coronary heart disease risk in human
populations. Large-scale meta-analyses of genome-wide
association studies have now leveraged common genetic
variation to identify multiple sites of disease susceptibility;
however, the causal mechanisms for these associations
largely remain elusive. One of these disease-associated
variants, rs12190287, resides in the 3' untranslated region
of the vascular developmental transcription factor, TCF21.
Intriguingly, this variant is shown to disrupt the seed
binding sequence for microRNA-224, and through altered
RNA secondary structure and binding kinetics, leads to
dysregulated TCF21 gene expression in response to
disease-relevant stimuli. Importantly, TCF21 and miR-224
expression levels were perturbed in human atherosclerotic
lesions. Along with our previous reports on the transcriptional
regulatory mechanisms altered by this variant, these
studies shed new light on the complex heritable mechanisms
of coronary heart disease risk that are amenable to
therapeutic intervention.

events that might relate to subsequent responses to cardiovascular
injury. Tcf21 has recently been shown to regulate cell-fate
determination and stages of cell differentiation throughout
coronary vascular development in mice. Tcf21 was shown to
mark populations of mesodermal-derived cells in the proepicardial
organ (PEO) at embryonic day 9.5, and mesenchymal-derived cells
in the developing pericardium at later time points [4-7]. Global
knockout studies in mice have confirmed an important role for
Tcf21 in the formation of coronary artery smooth muscle cells and
cardiac fibroblasts [8,9]. Tcf21 deletion results in aberrant smooth
muscle cell (SMC) differentiation and an absence of cardiac
fibroblasts, as evidenced by increased epicardial SMC marker
expression [8]. Together, these mouse studies suggest that loss of
Tcf21 expression leads to SMC expansion while sustained
expression is essential to cardiac fibroblast maturation, likely
through regulation of multipotent precursor cell fate.

Recent work in this laboratory has identified a cis-acting
mechanism by which the protective TCF21 G allele at variant
rs12190287 disrupts an activator protein 1 (AP-1)-like enhancer
element, to alter allele-specific transcriptional control of TCF21
gene expression [10]. Interestingly, this cis-regulatory element
modulates growth factor (platelet-derived growth factor receptor-β)
and epicardial development (Wilms tumor 1) signaling
pathways in coronary artery SMC [10]. In complementary studies
reported herein, we provide evidence that the 3'-untranslated
region (3'-UTR) of TCF21 binds miR-224 to regulate expression
of this gene, and that this regulation is obviated by the minor allele,
which confers a seed mismatch to disrupt miR-224 binding and
accessibility of this region of the TCF21 3'-UTR. To our
knowledge, these data provide the first example of miRNA
binding disruption as a likely mechanism for a CHD risk gene
association, and the first example of concurrent miRNA and
transcriptional regulation at a single disease-associated causal
variant.

Results 

Allelic expression imbalance at rs12190287 in circulating
leukocytes and HCASMC

To better understand the mechanisms of disease risk at 6q23.2,
we set out to define causal variation among the CHD-associated
SNPs by examining the allele-specific expression (ASE) in
heterozygous individuals for the transcript variant rs12190287,
which is located in the 3'-UTR of the TCF21 gene. By measuring
the relative ASE within individuals, this approach has the ability to
maximize detection of cis-regulatory variation on TCF21 gene
expression, with each allele controlled by similar trans-acting and
environmental influences. Based on TaqMan SNP genotyping
assays of total white blood cell RNA and genomic DNA from 22
heterozygous individuals (from the GENEPAD cohort), we observed
an approximate 1.3-2.0 fold ASE of the minor protective allele (G)
over the major risk allele (C) in 18/22 samples, P= 1.1x10
(Fig. 1A). Importantly, we observed consistent allelic imbalance
(1.8-2.5 fold ratio G/C) in primary human coronary artery
smooth muscle cells (HCASMC) maintained under normal
conditions and detected using pyrosequencing assays (Fig. 1B).
Together these data suggest that the disease-associated risk allele,
or other variants in tight LD, contribute to decreased TCF21
allele-specific expression. Intriguingly, these results contrast with
published eQTL data at rs12190287, which demonstrate that the risk
allele is associated with elevated TCF21 expression in omental
adipose and liver tissues [2,11]. Also, our recent work elucidated a
bi-directional mechanism involving both trans-activating AP-1 and
trans-repressing Wilms tumor 1 (WT1) transcription factor binding
to a cis-regulatory element at rs12190287, resulting in altered
allele-specific TCF21 expression levels [10]. Given this bi-directional
mode of transcriptional regulation, we explored alternative
regulatory mechanisms to potentially explain the allelic imbalance
at rs12190287.
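
As an illustration of the allelic ratio reported here, the following is a minimal sketch, assuming the common convention of dividing the G/C signal ratio in cDNA by the G/C signal ratio in gDNA from the same heterozygous sample. The function name and the input values are hypothetical and are not data from this study.

```python
def normalized_allelic_ratio(cdna_g, cdna_c, gdna_g, gdna_c):
    """Return the G/C allelic ratio in cDNA, normalized to the G/C ratio in gDNA.

    In a heterozygote the gDNA ratio is expected to be close to 1, so the
    normalization mainly removes assay bias between the two allele probes.
    """
    if min(cdna_c, gdna_g, gdna_c) <= 0:
        raise ValueError("allele signals used as denominators must be positive")
    return (cdna_g / cdna_c) / (gdna_g / gdna_c)


# Hypothetical allele signals for one heterozygous sample (illustrative only).
ratio = normalized_allelic_ratio(cdna_g=60.0, cdna_c=40.0, gdna_g=50.0, gdna_c=50.0)
print(f"normalized G/C allelic ratio: {ratio:.2f}")
```

A ratio above 1.0 in this convention would indicate relatively higher expression of the G allele transcript.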

Predicted miR-224 binding and altered TCF21 3'-UTR
secondary structure at rs12190287

Recent studies using allelic imbalance sequencing demonstrate
that SNPs frequently alter microRNA-mediated repression, by
creating or disrupting complementary miRNA binding sites [12].
In silico analyses, based on conservation of miRNA seed regions,
predict >60% of human 3'-UTRs are under selective control by
miRNAs [13,14]. We scanned the TCF21 3'-UTR for seed
matches using both TargetScan and MiRanda prediction algorithms.
Both tools identified rs12190287 (position 1058 from the
5'-UTR) residing within a 7-mer mammalian conserved binding site
for mature miR-224 (Fig. 2A). Alignment of rs12190287 major
and minor alleles demonstrated a perfect seed match between the
TCF21 3'-UTR containing the major risk allele (C) and miR-224
(nucleotides 2-8; positions 1042-1061), with ΔΔG = -2.43, and
a seed mismatch between the minor protective allele (G) and
miR-224 (ΔΔG = 4.67) (Fig. 2B).
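
The seed match described above can be checked directly from the sequences printed in the Fig. 2 alignment. The following is a minimal sketch, assuming those sequences are transcribed correctly (with the alignment gap removed from the 3'-UTR fragments); the helper name `seed_target_site` is hypothetical, and the check only tests exact complementarity to miRNA nucleotides 2-8, not the free-energy terms computed by TargetScan or MiRanda.

```python
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def seed_target_site(mirna_3to5, start=2, end=8):
    """Return the mRNA sequence (5'->3') that pairs perfectly with the miRNA seed.

    The miRNA is given 3'->5' as in the figure alignment, so it is reversed
    first to obtain the conventional 5'->3' orientation.
    """
    mirna_5to3 = mirna_3to5[::-1]
    seed = mirna_5to3[start - 1:end]          # nucleotides 2-8 of the miRNA
    return seed.translate(COMPLEMENT)[::-1]   # reverse complement, 5'->3'


# Sequences as printed in the Fig. 2 alignment (miRNAs written 3'->5').
MIR224 = "UUGCCUUGGUGAUCACUGAAC"
MIR224_SNP = "UUGCCUUGGUGAUCACUCAAC"
UTR_C = "GGGCUGAGAACUUCGGUGACUUC"  # major risk allele (C)
UTR_G = "GGGCUGAGAACUUCGGUGAGUUC"  # minor protective allele (G)

for name, mirna in (("miR-224", MIR224), ("miR-224_SNP", MIR224_SNP)):
    site = seed_target_site(mirna)
    print(name, site, "C allele:", site in UTR_C, "G allele:", site in UTR_G)
```

Under these assumptions, the miR-224 seed site is found only in the C-allele fragment, and the miR-224_SNP seed site only in the G-allele fragment.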

We also investigated the RNA secondary structure of the TCF21
3'-UTR variants. Systematic in silico RNA structural predictions
were performed and analyzed as previously described [15,16].
Representative predicted local secondary structures of TCF21
rs12190287 C and G variants are shown from positions 941-1141
(Fig. 2B). While both 3'-UTR variants adopt similar global RNA
structures, they are predicted to adopt distinct local secondary
structures in proximity of the SNP. For instance, the seed
matching sequence of the C variant seems to be mostly located
within a loop structure and overall, the segment complementary to
miR-224 (shaded grey) is located in a structurally accessible local
structure (Fig. 2C). In contrast, the miR-224 binding sequence
segment of the G variant is located in a local structure that does
not seem to be accessible, i.e., the seed-matching element is
located near a stem-loop junction within an intra-molecular
duplex element (Fig. 2C). These observations were consistent
among the 180 different structures analyzed, with the C variant
SNP typically located in a loop structure and the G variant SNP
Figure 1. Allele specific expression at rs12190287 in human peripheral blood samples. (A) TaqMan quantitative PCR results depicting
allele specific expression of TCF21 at rs12190287 in peripheral blood samples (n = 22 heterozygous samples) obtained from human cohort studies.
Allelic expression and genotyping data were determined from cDNA and gDNA, respectively, using a pre-calibrated TaqMan SNP genotyping probe
for rs12190287, with each sample performed in triplicate. Data are expressed as the normalized allelic ratio of cDNA/gDNA and values represent
mean ± SEM. (B) Representative pyrosequencing traces from HCASMC cDNA and gDNA from various cell lots. Allelic ratios were determined from the
area under the curve for major and minor allele. Similar results were observed from three independent experiments. P values shown were calculated
from combined data using a paired t-test compared to allelic ratio of gDNA samples. Asterisks represent individual level of significance versus
expected allelic ratio of 1.0, * P<0.05, **P<0.01, ***P<0.001.
doi:10.1371/journal.pgen.1004263.g001



often located along the stem (Fig. 2D). Similar differences in local
RNA structure were predicted using the RNAfold minimal free
energy (MFE) prediction algorithm (Supplementary Fig. S1). The
overall difference in MFE for these structures is predicted to be
only 2 kcal/mol, suggesting that the structure containing the C
variant is slightly less stable. Different local RNA structures could
also involve differential recruitment of RNA binding proteins
(RBP), such as Pumilio, previously shown to alter p27 3'-UTR
local structure and miR-221/222 accessibility [17]. In summary,
these significant allelic structural differences implicate differences



PLOS Genetics | www.plosgenetics.org 



3 



March 2014 | Volume 10 | Issue 3 | e1004263 



MicroRNA-224-Mediated Regulation at TCF21 Locus 



[Figure 2, panels A-D. (A) Schematic of the TCF21 transcript: 5'-UTR, AUG at position 279, CDS, stop codon, and 3'-UTR extending to position 3249, with rs12190287 (pos. 1058) inside the miR-224 binding site (pos. 1042-1061). (B) Alignments:

TCF21 3'-UTR C   5'-...GGGCUGAGAACUUCG-GUGACUUC...-3'
miR-224          3'-UUGCCUUGGUGAUCACUGAAC-5'

TCF21 3'-UTR G   5'-...GGGCUGAGAACUUCG-GUGAGUUC...-3'
miR-224          3'-UUGCCUUGGUGAUCACUGAAC-5'
miR-224_SNP      3'-UUGCCUUGGUGAUCACUCAAC-5'

(C) Predicted local secondary structures of the TCF21 C and TCF21 G variants. (D) Bar chart of SNP location (stem, loop, or bulge) across the analyzed structures for TCF21 C and TCF21 G.]


Figure 2. Predicted interaction of miR-224 with TCF21 variant 1 3'-UTR and secondary structural changes at rs12190287. (A)
TargetScan and MiRanda prediction algorithms identified rs12190287 (position 1058) residing within a 7-mer, mammalian conserved binding site for
mature miR-224 (represented as an exact seed match for nucleotides 2-8; positions 1042-1061) (Fig. 2A). (B) Alignment of rs12190287 major and
minor alleles demonstrated a perfect seed match between the TCF21 3'UTR containing the major risk allele (C) and miR-224, and a seed mismatch
between the minor protective allele (G) and miR-224. (C) Systematic in silico RNA secondary structure predictions were performed as described in the
text. Representative predicted structures of TCF21 rs12190287 C and G variants are shown from positions 941-1141 (Fig. 2B). While both variants have
similar global structures, they adopt distinct local secondary structures in proximity of the SNP. (D) Summary of SNP rs12190287 location within 180
analyzed RNA secondary structures, resulting in major risk allele (TCF21 C) typically located in loop region, while minor protective allele (TCF21 G) was
often located in the stem region.
doi:10.1371/journal.pgen.1004263.g002



in miR-224 accessibility, binding kinetics, and binding affinity, 
which may impact miR-224 mediated regulation of TCF21. 

miR-224 dependent post-transcriptional regulation of
TCF21 3'UTR at rs12190287

We first evaluated the possibility that the TCF21 3'-UTR
variants at rs12190287 differentially regulate protein expression
through miR-224 targeting using a pmiR-GLO luciferase reporter
system. The TCF21 3'-UTR variants (containing the major or
minor alleles) were inserted downstream of the firefly luciferase
gene, luc2, to quantitatively measure post-transcriptional effects of
miRNA activity, as previously described [18]. We synthesized
miR-224 guide and passenger strands using miRBase sequences to
generate double-stranded miR-224, which has a matched seed
sequence of mature miR-224 to the C allele of the TCF21 3'-UTR
target site but a mismatch to the G allele of TCF21 3'-UTR
(Supplementary Fig. S2, top). In order to test the specificity of
the miRNA-mediated regulation of the C variant, we restored
base-pairing in the seed region by synthesizing a miR-224 guide
strand with a G>C substitution (referred to as miR-224_SNP,
Supplementary Fig. S1, bottom). Using HCASMC co-transfected
with the TCF21 3'-UTR reporters and double-stranded miR-224,
we observed selective repression of the C variant compared to the
G variant (Fig. 3A). Allele-specific differences in reporter activity
were abolished when we co-expressed the adapting miR-224_SNP.
Alternatively, using a loss-of-function approach with a
selective miR-224 inhibitor, we observed increased reporter
activity only with the C variant. These results further suggest that
the C variant of the TCF21 3'-UTR can be directly regulated by
miR-224, while the G variant cannot. We also observed similar
functional effects in the aortic smooth muscle cell line A7r5
(Fig. 3B) and HeLa cells (Fig. 3C). However, the observation that
miR-224_SNP did not completely block the allele-specific reporter
activity in HeLa may suggest cell type differences in endogenous
miR-224 levels. For instance, both TCF21 and miR-224 are
weakly expressed in A7r5 and HeLa cells relative to HCASMC
(unpublished observations). Taken together, these data support a 
functional role of miR-224 in various cell types including 
HCASMC, by preferentially targeting the TCF21 3'UTR C 
variant, leading to post-transcriptional repression. 

Distinct in vitro annealing kinetics between miR-224 and 
TCF21 3'UTR variants 

A striking positive relationship exists between the extent of
regulation and the annealing kinetics of the RNA regulator and its
target RNA [19,20]. Thus, differential regulation of TCF21
3'UTR variants by miR-224 could result from altered kinetics of
mRNA:miRNA complex formation. We monitored the annealing
kinetics of miR-224 binding to the TCF21 3'-UTR C and G
variants in vitro under experimental conditions that are assumed to
mimic known cellular facilitators of RNA-RNA annealing [21]. It
is important to note that RNA-RNA annealing can be substantially
promoted even at the cost of binding energy, i.e. the association of
complementary ribonucleic acids can be greatly increased without
lowering the Arrhenius activation energy or even significantly
altering RNA structure [21]. The full-length TCF21 3'-UTR
variants were generated by in vitro transcription (IVT) and
incubated in 10-fold excess with ³²P-labeled miR-224 for various
time points, followed by autoradiography detection. Interestingly,
greater amounts of the C variant 3'-UTR:miRNA complexes were
generated over time compared with the G variant (Fig. 4A, left
panel). Further, the C variant:miRNA complex formation was
completely blocked in the presence of ³²P-labeled miR-224_SNP,
which generates a seed mismatch (Fig. 4A, right panel). However,
the miR-224_SNP, which would generate a seed match with the G
variant, had no effect on G variant:miRNA complex formation.
The C variant TCF21 3'-UTR:miR-224 complexes also formed at
a faster rate (k_obs = 2.2×10⁶ M⁻¹ s⁻¹) than the G variant
(k_obs = 1.4×10⁶ M⁻¹ s⁻¹) as determined from second-order
reactions (Fig. 4B and 4C). Together these data suggest that miR-224
preferentially binds the major risk C variant of TCF21, and at a
faster rate in vitro, compared to the minor protective G variant.
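
To make the rate-constant units concrete, the following is a minimal sketch of one common way to estimate a second-order rate constant when one binding partner is in large excess (a pseudo-first-order approximation). This is an assumption for illustration and not necessarily the procedure used for Fig. 4C; the function name, concentration, and time points are hypothetical, and the input fractions are simulated rather than measured.

```python
import math


def second_order_rate_constant(times_s, fraction_bound, excess_conc_molar):
    """Estimate k2 (M^-1 s^-1) under a pseudo-first-order approximation.

    With one partner in large excess at concentration C, the bound fraction
    follows f(t) = 1 - exp(-k2 * C * t), so -ln(1 - f) is linear in t with
    slope k2 * C. The slope is fitted by least squares through the origin.
    """
    y = [-math.log(1.0 - f) for f in fraction_bound]
    slope = sum(t * v for t, v in zip(times_s, y)) / sum(t * t for t in times_s)
    return slope / excess_conc_molar


# Hypothetical conditions (illustrative only, not taken from the study).
CONC = 1e-8                          # M, assumed concentration of the excess RNA
TIMES = [60.0, 120.0, 300.0, 600.0]  # s, assumed sampling times

# Simulated fractions generated from a known rate constant, then recovered.
simulated = [1.0 - math.exp(-2.2e6 * CONC * t) for t in TIMES]
k2 = second_order_rate_constant(TIMES, simulated, CONC)
print(f"k2 = {k2:.2e} M^-1 s^-1")
```

With real band-intensity data, the fitted slope would also carry experimental noise, so replicate experiments and an error estimate would be needed.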

Different structural conformations of TCF21 3'-UTR RNA 
determined by in vitro probing 

We then used RNA in vitro probing to test and validate our in
silico secondary structure predictions for the TCF21 3'-UTR
variants, which demonstrated allele-specific local structural
alterations. Briefly, we chemically probed the TCF21 3'-UTR variants
with Pb²⁺ (to monitor all unpaired nucleotide residues) and probed
enzymatically using RNase T1 (only cleaves unpaired G nucleotide
residues). After probing, the cleavage patterns were evaluated
by primer extension and subsequent denaturing gel electrophoresis,
as described [18]. Probing the TCF21 3'-UTR variants
(positions 1040-1075) with Pb²⁺ revealed unique cleavage sites
proximal to the miR-224 target site (positions 1042-1061) and
rs12190287 (position 1058) (Fig. 5A). For instance, the C variant
has stronger and additional sites located proximal to rs12190287
(positions 1058-1063) in comparison to the G variant, which has
some unique weak cleavage sites at positions 1045-1049. The
specificity of RNase T1 to cleave G residues explains the
occurrence of an additional weak cleavage product at the SNP
position (1058) for the G but not C IVT (Fig. 5B). Additional
stronger cleavage by RNase T1 was observed at position 1054 of
the G variant, and a weaker cleavage at position 1070 of the G
variant, summarized below (Fig. 5C). Together, these results are in
line with the in silico predicted RNA structures, which imply there
are a number of local structural differences between the two
variants, resulting in altered accessibility at sites near rs12190287.
It should be noted, however, that the structure-function
relationship of RNA-RNA annealing is complex. Since the pairing
mechanism of this TCF21 case is not known, we cannot relate local
structures, annealing kinetics, and biological effects. Nonetheless,
we observe differences at all levels of interaction, strongly
suggesting a mechanistically distinct regulation.

TGF-β and PDGF-BB signaling mediate inverse-correlated
miR-224 and TCF21 expression and ASE in HCASMC

Next, we investigated the regulatory pattern of endogenous
TCF21 and miR-224 gene expression levels in HCASMC. We first
explored a potential link between relevant pathways upstream of
miR-224 and TCF21 that may account for miR-224-TCF21
3'-UTR allele-specific regulation in HCASMC. Importantly, our
previous work identified platelet-derived growth factor (PDGF)
and transforming growth factor-beta (TGF-β) dependent signaling
pathways as respective positive and negative mediators of
cis-regulatory elements at rs12190287 in HCASMC [10]. PDGF-BB
ligand mediates increased SMC proliferation, survival, and
migration [22] through PDGFRβ, which is critical for
epithelial-mesenchymal transition (EMT) and formation of coronary artery
SMC [23]. As a pleiotropic vasoactive cytokine, transforming
growth factor beta (TGF-β1) also regulates EMT and diverse
SMC growth and remodeling processes [24]. Interestingly, we
observed a modest negative correlation (r = -0.3287) of endogenous
TCF21 and miR-224 expression levels in HCASMC treated
with PDGF-BB, although this result did not reach statistical
significance (Fig. 6A). However, TGF-β1 treatment resulted in
significant and highly inverse-correlated endogenous TCF21 and
miR-224 expression levels, r = -0.7061, P = 0.0015 (Fig. 6B). We
then measured the effects of these stimuli on miR-224-mediated
regulation of total and allele-specific TCF21 transcript levels. As
expected, PDGF-BB treatment led to increased total TCF21
expression levels, whereas TGF-β1 led to reduced TCF21, which
was blunted in all cases by pre-miR-224 (Fig. 6C). We also
observed pre-miR-224 to attenuate both PDGF-BB and TGF-β1
stimulated allele-specific TCF21 expression (shown as the normalized
ratio of C/G at rs12190287) (Fig. 6D). These results identify
PDGF-BB and particularly TGF-β1 as potential upstream
mediators of miR-224 directed allele-specific TCF21 expression
at rs12190287.
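
For readers who want to reproduce this kind of correlation on their own measurements, the following is a minimal sketch of the Pearson correlation coefficient computed from paired, normalized expression values. The function name and the example values are hypothetical and are not data from this study.

```python
import math


def pearson_r(x, y):
    """Return Pearson's correlation coefficient for two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical normalized expression values at matched time points.
tcf21_levels = [1.0, 0.9, 0.7, 0.6, 0.4]
mir224_levels = [0.5, 0.6, 0.9, 1.0, 1.3]
r = pearson_r(tcf21_levels, mir224_levels)
print(f"Pearson r = {r:.3f}")
```

A negative r, as in this made-up example, corresponds to the inverse relationship described in the text; assessing significance additionally requires the sample size.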

TCF21 and miR-224 expression in human atherosclerotic 
coronary artery lesions 

To establish a potential role of miR-224-TCF21 regulation
during atherosclerosis progression, we measured endogenous levels
of miR-224 and TCF21 in human coronary artery lesions.
Immunohistochemical staining of adjacent sections demonstrated
TCF21 protein localized within the neointimal and medial layers
of the left anterior descending (LAD) coronary artery (n = 4)
(Fig. 7A, upper panel). TCF21 marked a population of cells
resembling smooth muscle cells, indicated by alpha-smooth muscle
actin (α-SMA) immunoreactivity in similar regions. TCF21 protein
was also detected in the adventitia in a few samples, consistent with
the expression pattern observed in small intramyocardial coronary
arteries [25]. We also localized endogenous miR-224 in these
sections using in situ hybridization, which identified miR-224 in
both the neointimal and adventitial layers, but not the medial layer
(Fig. 7A, lower panel). We validated these findings using
microarray-based analysis of normal (no lesions), stable (asymptomatic)
and unstable (symptomatic) carotid atherosclerotic
lesions. TCF21 mRNA levels were significantly upregulated in
both asymptomatic (P = 0.0106) and symptomatic (P = 0.0074)
Figure 3. Allele-specific miR-224 regulation of TCF21 3'UTR at rs12190287. Luciferase reporter assay of TCF21 rs12190287-C and G 3'-UTR
variants determined in (A) primary coronary artery smooth muscle cells (HCASMC), (B) rat aortic smooth muscle cell line, A7r5, and (C) HeLa cell line.
Negative control miRNA (miR Con), miR-224, miR-224_SNP, anti-miR negative control (anti-miR Con) or anti-miR-224 inhibitors were co-transfected
with 3'-UTR reporters for 24 hrs and the relative luciferase activity (ratio of firefly/Renilla luciferase activity) was measured and normalized to
C-3'-UTR+miR Con or anti-miR Con, shown as fold change. Data represent mean ± SEM of triplicates. Similar results were observed from three
independent experiments. P-values are shown for intra and inter-assay comparisons where statistically significant (P<0.05).
doi:10.1371/journal.pgen.1004263.g003



atherosclerotic plaques (Fig. 7B). Interestingly, miR-224 was
significantly downregulated in stable and unstable atherosclerotic
plaques (P = 1.5×10⁻⁵ and P = 8.2×10⁻⁸, respectively), as determined
by TaqMan qPCR (Fig. 7C). These data confirm that both
TCF21 protein and miR-224 are expressed in the diseased vessel
wall in vivo, and their expression is inversely regulated during
atherosclerosis, consistent with our observations in HCASMC.
Together these findings provide additional mechanistic insights
into the TCF21 association with respect to coronary heart disease
progression.
Figure 4. In vitro annealing kinetics between miR-224 and TCF21 3'UTR variants. (A) (Left panel) TCF21 3'-UTR variants were generated by in
vitro transcription (IVT) and incubated in excess over ³²P-labeled miR-224 for various time points, followed by autoradiography detection. Band
intensities indicate relative amounts of the 3'-UTR variant:miRNA complexes formed over indicated times. (Right panel) IVT 3'-UTR variants were also
incubated in excess over ³²P-labeled miR-224_SNP, resulting in a seed mismatch with the C variant and a seed match with the G variant. (B) Band
intensities of 3'-UTR:miRNA complexes formed using ³²P-labeled miR-224 or ³²P-labeled miR-224_SNP were detected by PhosphorImager, and
ImageQuant software was used to quantify complex signal as a percentage of whole-lane signal. Values represent mean ± SEM
from three independent experiments. (C) Calculation of second-order rate constants for individual mRNA:miRNA complexes was performed as
previously described [54]. n.d., complex formation was too slow to derive a rate constant.
doi:10.1371/journal.pgen.1004263.g004
Figure 5. Allele-specific structural differences in conformations of TCF21 3'-UTR RNAs determined by in vitro probing. (A) Results of
chemical probing of the in vitro transcribed TCF21 3'-UTR variants with varying amounts of Pb²⁺ (0, 10, 20 and 40 mM) and (B) enzymatic probing
with varying amounts of RNase T1 (0, 0.25, 1 and 2 units). Major cleavage sites are shown along with their positions. miR-224 binding site is
cleavage strength is indicated by open and closed triangles and circles. Results are representative of at least three independent experiments.
doi:10.1371/journal.pgen.1004263.g005



Genome-wide prediction of disease-associated variants
overlapping both TF and miRNA binding sites

It is also noteworthy that the minor protective allele at
rs12190287 disrupts both a TF binding motif, TGACTTCA, as
well as a miRNA seed sequence, GUGACUU, in the 3'UTR of
TCF21. Given this unanticipated integration of both positive
cis-acting transcription factor binding and negative post-transcriptional
miRNA regulation at TCF21, we sought to estimate the
overall frequency of these overlapping regulatory features in
humans using publicly available genome-wide datasets. Using
validated TF binding ENCODE ChIP-seq regions (~4,400,000)
intersected with medium conserved miRcode predicted miRNA
binding sites (~1,100,000) we identified approximately 290,000
overlapping regions (approximately 28% of all predicted miRNA
binding sites) (Supplementary Fig. S3). We then intersected total
disease-associated polymorphisms from the National Human
Genome Research Institute (NHGRI) catalog, including those in
LD at r² > 0.8 (~292,000), resulting in 52,263 sites (17.9%) in TF
ChIP regions, 942 sites (0.32%) in miRNA binding sites, and 146
sites overlapping with both features (0.05%) (Fig. S3A and Table
S1). Interestingly, this overlap was less frequent when applied to all
common variants (12.8%, 0.16%, and 0.04%, respectively)
(Supplementary Fig. S3B). We also observed 20,064 (37%) regions
of TargetScan predicted conserved miRNA binding sites residing
of TargetScan predicted conserved miRNA binding sites residing 
Figure 6. Correlation of endogenous TCF21 and miR-224 expression levels in HCASMC. (A) TaqMan quantitative PCR results showing the
correlation of endogenous TCF21 variant 1 expression levels with miR-224 in HCASMCs stimulated with recombinant human PDGF-BB (20 ng/ml) for
various time points (0, 1, 2, 6, 12, 24 hrs) (n = 16). (B) Similar experiments performed in HCASMCs stimulated with recombinant human TGF-β1 (5 ng/ml)
for various time points (0, 1, 2, 6, 12, 24 hrs) (n = 16). TCF21 and miR-224 expression levels were normalized to 18S and RNU44, respectively.
Pearson's correlation was determined assuming a linear relationship, with resulting r and P-values shown. (C) TaqMan qPCR results measuring total
TCF21 transcript levels in HCASMC transfected with negative control miRNA mimic (miR Con) or miR-224 mimic and stimulated with either PDGF-BB
or TGF-β1 for 6 hrs. Data represent mean ± SEM of triplicates. Similar results were observed from three independent experiments. (D) Allele-specific
TaqMan qPCR measuring TCF21 expression at rs12190287 in HCASMCs treated as described above. Values are expressed as the normalized ratio of
C/G alleles. Data represent individual replicates from three independent experiments (n = 8-9).
doi:10.1371/journal.pgen.1004263.g006



within TF ChIP-seq peaks. Functional annotation of these regions
resulted in significant enrichment of mitogen activated protein
kinase (MAPK) (P = 1×10⁻³⁹), cytokines (P = 1×10⁻⁵⁴), and
TGF-β (P = 1×10⁻²³) pathways (Supplementary Fig. S3D), as
well as bZip (P = 1×10⁻⁸¹) and p53 (P = 1×10⁻²⁹) TF binding
protein domains versus those expected by chance (Supplementary
Fig. S3C). Given the critical role of bZip domain TF families (e.g.
AP-1, ATF and CREB) in various cancers, inflammation and
developmental processes [26], concurrent miRNA binding to
mRNA regions overlapping these sites (e.g. miR-224-TCF21) may
represent an exquisite fine-tuning control of target gene expression.

Discussion 

A large fraction of CHD susceptibility loci recently identified
through GWAS do not appear to be mediating risk through effects
on traditional risk factors, such as lipid levels and blood pressure.
Investigating the mechanism(s) of the disease risk association at
these loci promises to provide critical new information regarding
fundamental disease pathways in the vessel wall that function
upstream of the causal variation and the related causal gene
[1,2,27]. One gene that we have chosen to study in this regard is
TCF21, a gene that was originally identified and replicated in the
CARDIoGRAM meta-analysis of GWA data, and has now been
verified through additional meta-analysis in subjects of both
European and Han Chinese descent [1-3]. TCF21 encodes a
basic-helix-loop-helix transcription factor that is involved in
controlling cell fate decisions in developing coronary artery
SMC, and may provide insight into the possible causal role of
this cell type in atherosclerosis [8,9].

Initial eQTL analysis showed that TCF21 expression is related
to the genotype at rs12190287, providing the first suggestion that
TCF21 is indeed the causal gene at this locus (Table S3) [2]. These
data, in conjunction with evidence that rs12190287 (1) is
associated with a P-value that is three orders of magnitude lower

Figure 7. Expression of TCF21 and miR-224 in human atherosclerotic lesions. (A) (Upper panel) Immunohistochemical staining results
showing endogenous TCF21 protein expression (in red) in neointima and media regions of left anterior descending (LAD) coronary artery sections
(10× magnification). Adjacent sections were incubated with rabbit serum (negative control) or anti-alpha-smooth muscle actin (α-SMA) antibody to
localize smooth muscle-like cells. Methyl Green was used as a nuclear counterstain. (Lower panel) Representative in situ hybridization results showing
endogenous miR-224 (in indigo) localized in the neointima and adventitia in adjacent LAD sections (20× magnification). Rb: rabbit, LNA: locked
nucleic acid, a: adventitia, m: media, ni: neointima, fc: fibrous cap, necrotic core. Arrows denote specific staining. Scale bars = 0.5 mm. (B) Microarray
gene expression results showing regulation of TCF21 mRNA and (C) TaqMan quantitative PCR results depicting miR-224 expression, during disease
progression in normal, stable and unstable human carotid atherosclerotic lesions (n = 10 per group). Microarray-based expression levels were
normalized by robust multi-array average (RMA) and TaqMan-based levels were normalized to the RNU44 internal control. Values represent mean
Log2 fold change of replicates and lines represent mean ± SEM. Similar results were observed from two independent experiments.
doi:10.1371/journal.pgen.1004263.g007



than that for other associated SNPs within the susceptibility locus
[2], (2) is only modestly correlated with other SNPs that reach
genome-wide significance within the locus [10], (3) is found in a
region of open chromatin configuration [10], and (4) resides within
the TCF21 structural gene, collectively suggest that rs12190287 is
the causal variant within this susceptibility locus. To further
investigate this possibility, we have pursued allele-specific expression
(ASE) studies as reported here, seeking to correlate ASE with
genotype at rs12190287. These studies employing RNA from



circulating leukocytes show highly significant ASE at the TCF21
gene, and consistent allelic expression divergence suggests that
rs12190287 is the causal SNP. Although possible, it would seem
very unlikely that the ASE is due to another SNP in high linkage
disequilibrium with this SNP, since other associated variants are
correlated at best with an r² ~0.6. Also, while leukocytes are an
appropriate cell type in atherosclerosis and express a number of
the signaling components upstream of TCF21, they may not be the
primary cell type reflecting TCF21 function. In this regard, we

Figure 8. Proposed model of signaling pathways converging on miR-224-TCF21 interaction at rs12190287. Previous studies elucidated
a cis-regulatory mechanism by which the lead SNP associated with CHD at 6q23.2, rs12190287, was shown to disrupt trans-activating AP-1 binding by
the minor protective allele (G). This resulted in altered growth factor mediated transcriptional activation, chromatin organization and allele-specific
TCF21 gene expression in human coronary artery smooth muscle cells (HCASMC). The trans-repressing factor, Wilms tumor 1 (WT1), was also shown to
counter-regulate the positive effects of AP-1 at rs12190287 and preferentially associate with the major risk allele (C). Herein we describe a
post-transcriptional cis-regulatory mechanism by which the minor protective allele alters a perfect seed match of miR-224 in the 3'-UTR of TCF21. Altered
RNA structure is predicted to account for differential miRNA binding kinetics, and regulation of transcription. Further, both PDGF-BB and TGF-β
upstream stimuli in HCASMC may account for the miR-224 mediated allele-specific expression at rs12190287. Additionally, both NFκB and Wnt
upstream signals have been proposed to regulate miR-224 in tumor cells, and may potentially participate in the described mechanisms in HCASMC.
Taken together, dysregulation of TCF21 is predicted to account for altered smooth muscle cell (SMC) response to injury due to phenotypic
modulation from a differentiated to proliferating SMC, leading to increased risk for CHD.
doi:10.1371/journal.pgen.1004263.g008



observed a consistent direction of ASE in a limited study of 
primary cultured HCASMC grown in the presence of serum. 
Unfortunately, eQTL studies with circulating leukocytes did not 
show a significant association with TCF21 expression so we could 
not compare the directionality between the leukocytes and the 
adipose and liver tissues employed in the original eQTL studies. 

miRNAs predominately affect gene expression by decreasing 
mRNA stability or inhibiting translation. These regulatory effects 
can be perturbed by allelic variation through SNPs directly 
interfering with basepair interactions in the seed sequence in the 
mRNA. Allelic variants can also alter the tertiary structure of the 
mRNA and hinder miRNA binding even when the SNP is located 
outside the seed sequence [28,29]. Here, we employed reporter
gene studies in HeLa cells, rat and human SMCs with both gain
and loss of function approaches to demonstrate that miR-224
regulates TCF21 expression at the protein level. Sequence analysis
predicts that rs12190287 alters the core miR-224 binding
sequence, and folding algorithms that identify lowest energy
conformations of the native and variant sequences suggest that the
minor G allele at rs12190287 produces a less favorable configuration
of mRNA folding for miR-224 binding. These hypotheses
were confirmed by kinetic studies showing decreased rate and
extent of miR-224 binding, and RNA structural probing studies
that revealed decreased availability of the miR-224 binding region
in the mRNA containing the minor G allele. The disruption of
miRNA binding is a well-established mechanism for alteration of
risk for various cancers [30]. However, these data showing that the
CHD causal variant rs12190287 can disrupt miR-224 binding
provide the first evidence for this type of mechanism for coronary
disease associated genes.



While a potential role for miR-224 in regulating vascular disease 
has not been defined, this miRNA has been studied in association 
with multiple cancer cell types and other cellular systems, and 
these data provide some insight into upstream pathways that might 
affect TCF21 expression and thus CHD risk [24,31-36]. Most
significant among these are NFκB, WNT and TGF-β, all of which
have been linked to atherosclerotic signaling pathways [24,34,36].
NFκB is a well-characterized transcription factor and mediator of
cellular activation by inflammatory cytokines and chemokines, and
in the context of hepatocellular carcinoma, miR-224 was shown to
be upregulated by tumor necrosis factor-α (TNF-α) and miR-224
regulation was linked to hepatocellular migration and invasion [31].
TGF-β stimulation of miR-224 expression has been characterized
in ovarian granulosa cells, where it has been implicated in cellular
proliferation and estradiol release [32]. Further,
miR-224 has been shown to be upregulated by the WNT signaling
pathway in medulloblastoma, where it was linked to inhibition of
proliferation, increased radiation sensitivity and reduced
anchorage-independent growth of tumor cells [35]. Each of these
pathways has been linked to atherosclerotic processes in the
diseased blood vessel wall, and could have a role in the TCF21
mediated risk for CHD [24,33,34,36].

Merging these data with those from previous studies of the
transcriptional regulation at rs12190287 provides a more complete
picture of the complexity of upstream signaling pathways that may
regulate TCF21 expression, and may be perturbed by this disease-
associated variant. We have shown that rs12190287 resides in an
atypical AP-1-like element and that PDGF can stimulate allele-
specific expression through this site, as one potential disease-
related pathway activated at this region [10]. PDGF has been
extensively implicated in atherosclerosis pathogenesis, and in vitro
genomic studies have suggested that TCF21 mediates PDGF
signaling ([37], and data not shown). Additionally, transcriptional
regulation studies at rs12190287 have also shown that the Wilms
tumor factor (WT1) inhibits expression of TCF21 through the
AP-1-like site, and that PDGF and TGF-β stimulation are
upstream inhibitors of WT1 expression in SMC [10]. WT1 is
known to inhibit expression of AP-1 like factors, and has been
shown to repress TCF21 expression in developmental models
[38-40]. Combining these data with those derived here for miR-224
provides compelling evidence for multiple signaling pathways,
operating by transcriptional and post-transcriptional mechanisms,
by which rs12190287 regulates TCF21 expression (Fig. 8).
Importantly, this is the first example of a disease-associated
variant that disrupts both transcription factor-DNA and miRNA-
mRNA interactions. Our genome-wide analysis provides further
support that additional disease-associated variants reside in
overlapping TF and miRNA binding regions, which likely have
pathophysiological relevance (Supplementary Fig. S3). We can
speculate that this bimodal regulation may partially explain the
"dynamic" eQTLs previously observed [41,42] which are
responsive to intracellular changes in differentiation state.
Previous work from this laboratory has characterized a
transcriptional regulatory mechanism that mediates TCF21 gene
expression differences through variation at rs12190287, and
studies presented here document a second mechanism by which
variation at this SNP can alter expression of individual alleles
[10]. Inherent in such mechanistic studies at disease-associated
loci is that altered allele-specific expression can alter disease risk
through either 1) changing the overall causal gene expression
level to alter the normal biological role of the causal gene or to
introduce a novel disease-promoting function; or 2) changing the
overall variance of causal gene expression, such that the gene
becomes disconnected from its normal signaling networks [43].
The former may be explained by increased transcription of the
rs12190287 risk "C" allele resulting in overall increased TCF21
expression, or the miRNA mechanism by which the "C" allele
interacts with miR-224 to decrease overall TCF21 expression. It is
possible that these counteracting pathways trigger overall TCF21 
expression variance linked to a disease-related phenotype, with 
different pathways being dominant in different disease contexts. 
Evidence for directionality of TCF21 expression is provided here 
in the context of carotid artery disease, with atherosclerotic 
vessels showing increased TCF21 expression (Fig. 7). These data 
are consistent with eQTL data in adipose and liver tissue samples 
which indicated that the risk "C" allele at rs12190287 is
associated with increased TCF21 expression, suggesting that the
transcriptional mechanism at rs12190287 may play a dominant
role on gene expression levels (Table S3). In addition, we also
demonstrate that miR-224 is reciprocally decreased in these
carotid diseased tissues, suggesting that miR-224 may function as
a repressor of aberrantly elevated TCF21 levels, but is blunted in
the process. It is important to consider that transcription factor
and miR-224 pathways are regulated by multiple upstream
pathways, which can regulate expression and/or activation of
TFs and miR-224. Verification of the direction of effect for
TCF21 expression, and the mechanisms that function at
rs12190287, will require additional studies with human vascular
disease samples to better assess in vivo gene expression in the
disease environment. Also, studies in Tcf21 genetic mouse models
should provide additional evidence of the direction and
mechanism of effect in the setting of vascular disease. It is well
known that TCF21 is protective for multiple human cancers and 
it will be of great interest to determine whether expression of this 



gene in disease-related cells inhibits or promotes vascular disease 
processes [44-47]. 

Finally, to fully understand the complexity of transcriptional 
and miRNA networks regulating TCF21 expression, it will be 
essential to investigate the risk contributed by additional alleles 
associated with disease at this locus. GWAS of an East Asian
cohort identified the CHD associated variant rs12524865 in the
TCF21 locus, and the fact that this SNP is poorly correlated with
rs12190287 in this racial ethnic group suggests that it is an
independent allele [3]. Previous studies from this laboratory
suggested that this variant may directly regulate TCF21 expression
through transcriptional pathways similar to those associated with
rs12190287 [10]. In addition, fine-mapping studies by the
CARDIoGRAM+C4D consortium have identified a second associated
allele in Caucasians centered around a variant ~100,000
basepairs upstream of TCF21, rs17062853, with this variant being
poorly correlated with rs12190287 in Caucasians, again suggesting
this is an independent allele [1]. It will be important to investigate
these alleles independently and collectively to begin to understand 
how they contribute to TCF21 regulation in the context of smooth 
muscle biology and disease processes. To better assess the 
pathophysiological role of different alleles it will be essential to 
conduct ASE analyses in cells isolated from human vascular 
lesions. These future studies may reveal how allelic variation at 
TCF21 affects relevant upstream signaling pathways during 
different disease states. 

Materials and Methods 

Allelic expression imbalance using TaqMan quantitative 
PCR 

Peripheral blood DNA and RNA were isolated from randomly 
selected buffy coat samples from individuals of European descent 
in two human cohort studies, GENEPAD (Genetic determinants 
of Peripheral Artery Disease) and GENESiPS (GENEticS of 
insulin sensitivity iPSc). Genomic DNA was isolated using the 
Qiagen DNeasy Blood and Tissue kit according to the manufacturer's
instructions. Genotypes at rs12190287 were determined
from 10 ng gDNA template using a predesigned TaqMan SNP
genotyping assay for rs12190287 (Applied Biosystems) and
performed in triplicate. Sanger sequencing also confirmed
heterozygous samples at rs12190287. Total RNA was isolated
using the Qiagen miRNeasy Mini kit according to the manufacturer's
instructions. Total cDNA was prepared from 1 µg RNA
using the High Capacity cDNA Reverse Transcription kit (Applied
Biosystems, #4368814). cDNA templates were used to amplify
allele-specific TCF21 using the TaqMan SNP genotyping probe
(Applied Biosystems). ASE was determined from 22 heterozygous
samples using the TaqMan SNP genotyping probe for rs12190287
and expressed as the normalized allelic ratio of cDNA/gDNA.
Calibration of the SNP genotyping assay was determined as 
previously described [10]. 
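
As an illustration of the normalized allelic ratio described above, the following is a minimal Python sketch. The function name and the example signal values are hypothetical, and it assumes that allele-specific signals for the C and G alleles have already been quantified from both cDNA and gDNA of the same heterozygous sample; it is not the calibration procedure of [10].

```python
def normalized_allelic_ratio(cdna_c, cdna_g, gdna_c, gdna_g):
    """Return the C/G allelic ratio in cDNA divided by the C/G ratio in gDNA.

    gDNA from a heterozygote carries both alleles in equal copy number, so
    dividing by its ratio corrects for unequal probe efficiency (assumption).
    """
    if min(cdna_c, cdna_g, gdna_c, gdna_g) <= 0:
        raise ValueError("allele signals must be positive")
    cdna_ratio = cdna_c / cdna_g
    gdna_ratio = gdna_c / gdna_g
    return cdna_ratio / gdna_ratio


# Hypothetical signals: a value below 1 would indicate relatively lower
# expression of the C allele transcript, a value of 1 no imbalance.
example = normalized_allelic_ratio(cdna_c=0.8, cdna_g=1.0, gdna_c=1.0, gdna_g=1.0)
print(example)
```

A ratio that differs consistently from 1 across heterozygous samples is what would be read as allelic expression imbalance.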

Allelic expression imbalance using pyrosequencing [10] 

DNA and RNA were prepared from 8 individual HCASMC
lots determined to be heterozygous at rs12190287 (confirmed by
Sanger sequencing). Pyrosequencing assays for rs12190287 were
performed as previously described with assays designed using
PyroMark Assay Design software (Qiagen). Forward rs12190287
PCR primer, 5'-biotinylated reverse PCR primer, and forward
pyrosequencing primers (Table S2) were synthesized by the
Protein And Nucleic acid (PAN) facility (Stanford). Approximately
20 ng gDNA or cDNA was amplified using forward and reverse
pyrosequencing primers under the following conditions: 94°C
4 min, (94°C 30 s, 60°C 30 s, 72°C 45 s) ×45, 72°C 6 min. PCR
products were verified by gel electrophoresis. Pyrosequencing
reaction was performed on PCR reactions using a PyroMark Q24
according to manufacturer's instructions. Allelic quantitation was
obtained automatically from the mean allele frequencies derived
from the peak heights using PyroMark Q24 software.

Cell culture [10] 

Primary human coronary artery smooth muscle cells
(HCASMC) were purchased from three different manufacturers,
Lonza, PromoCell and Cell Applications, and were cultured in
complete smooth muscle basal media (Lonza, #CC-3182)
according to the manufacturer's instructions. All experiments
were performed with HCASMC between passages 4-7. Genotypes
of HCASMC were determined as described above, and lots
heterozygous at rs12190287 were used for all experiments. The
A7r5 rat aortic SMC line was purchased from ATCC and cells
were maintained in Dulbecco's modified Eagle medium (DMEM,
Life Technologies, #11885-084) containing low glucose, sodium
pyruvate and L-glutamine and supplemented with 10% fetal
bovine serum (FBS). HeLa cells were maintained in DMEM
containing high glucose, sodium pyruvate and L-glutamine
supplemented with 10% FBS.

Dual-luciferase assay 

Double stranded DNA sequences containing the TCF21 3'-UTR
for rs12190287-C and G were subcloned into the multiple
cloning site (MCS) of the pmirGLO vector (Promega, #E1330),
located downstream of the translation stop codon and firefly
luciferase reporter gene luc2, driven by the PGK minimal
promoter and also carrying the Renilla luciferase reporter gene
hRluc, as an internal control. PCR and mutagenic primer
sequences to generate the TCF21 C and G 3'-UTR reporters 
are included in Table S2. Site-directed mutagenesis protocol was 
adapted from [48]. For gain-of-function studies, single-stranded, 
unmodified oligonucleotides for miR-224 (seed-matching TCF21 
C allele) and miR-224_SNP (seed-matching TCF21 G allele) were 
first annealed at an equimolar concentration at 95°C for 3 min 
and allowed to gradually cool to room temperature. Resulting 
double-stranded miR-224, miR-224_SNP or negative control 
miRNAs (Ambion/Life Technologies) were co-transfected at 
50 nmol/L along with TCF21 C-3'-UTR or G-3'-UTR reporter
constructs in HeLa, HCASMC or A7r5 using Lipofectamine 2000 
(Invitrogen/Life Technologies, #11668-019) according to the 
manufacturer's instructions. Alternatively, loss-of-function studies 
were carried out by co-transfecting 50 nmol/L anti-miR-224 or 
negative control anti-miR inhibitors (Ambion/Life Technologies). 
Culture media was changed after 6 hrs, and dual luciferase activity 
was measured after 24 hrs using either SpectraMax L luminometer
(Molecular Devices) or anthos Lucy3 luminometer (anthos
Mikrosysteme GmbH). Relative luciferase activity (firefly/Renilla
luciferase ratio) is represented as the fold change of respective
control condition as indicated. 
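
The relative luciferase calculation described above (firefly/Renilla ratio, then fold change over the control condition) can be sketched in Python as follows. The function names and readings are hypothetical, and the sketch assumes the fold change is taken against the mean of the control ratios.

```python
def luciferase_ratio(firefly, renilla):
    """Normalize the firefly reporter signal to the Renilla internal control."""
    if renilla <= 0:
        raise ValueError("Renilla signal must be positive")
    return firefly / renilla


def fold_change(sample_ratios, control_ratios):
    """Express each sample ratio relative to the mean control ratio."""
    control_mean = sum(control_ratios) / len(control_ratios)
    return [ratio / control_mean for ratio in sample_ratios]


# Hypothetical raw luminometer readings as (firefly, Renilla) pairs.
control = [luciferase_ratio(f, r) for f, r in [(2000.0, 1000.0), (4000.0, 2000.0)]]
treated = [luciferase_ratio(f, r) for f, r in [(1000.0, 1000.0), (3000.0, 1000.0)]]
print(fold_change(treated, control))
```

Normalizing to Renilla first means that differences in transfection efficiency between wells do not carry into the fold change.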

Quantitative miRNA, total mRNA and allele-specific gene 
expression 

HCASMC were maintained as described above under normal 
growth factor and serum supplemented conditions. Upon reaching 
~70% confluence cells were serum-starved overnight prior to
stimulation with human recombinant PDGF-BB or TGF-β1 for
various times in triplicates. Samples were randomized (n = 16) and
total RNA was isolated using the miRNeasy Mini kit (Qiagen).
Total cDNA was prepared from 1 µg RNA using the High



Capacity cDNA Reverse Transcription kit (Applied Biosystems, 
#4368814). Alternatively, miRNA specific cDNA was prepared 
using the TaqMan miRNA Reverse Transcription kit (Applied 
Biosystems/Life Technologies, #4366596) and predesigned RT 
probes for human miR-224 or human control miRNA RNU44 
(Applied Biosystems/Life Technologies). cDNA templates were 
used to measure endogenous human miR-224 and TCF21 variant
1 (TCF21 v1) expression levels using predesigned TaqMan gene
expression assay probes (Applied Biosystems/Life Technologies)
according to the manufacturer's instructions. TCF21 v1 and
miR-224 levels were quantitated on a ViiA 7 Real-Time PCR system
(Applied Biosystems) and normalized to 18S and RNU44 levels, 
respectively. Pearson's correlation was determined assuming a 
linear relationship. 
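
For readers who want to reproduce the correlation step, a minimal Python sketch of Pearson's r is given below. The function name and the expression values are hypothetical placeholders, not data from this study, and the P-value calculation is omitted.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical normalized expression levels (TCF21 vs. miR-224).
tcf21 = [1.0, 0.9, 0.7, 0.6, 0.4]
mir224 = [0.5, 0.6, 0.9, 1.0, 1.3]
print(pearson_r(tcf21, mir224))  # negative value indicates anti-correlation
```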

For expression analyses with miR-224 overexpression, 
HCASMC were cultured as described above under normal 
conditions. The day after plating the cells were transfected with 
either miR negative control (miR Con) or miR-224 mimic using 
Lipofectamine RNAiMAX (Life Technologies, #13778150) for 
5 hrs. Culture media was changed to serum-free and cells were 
incubated overnight prior to stimulation for 6 hrs with either 
vehicle, 20 ng/ml human recombinant PDGF-BB (R&D Systems,
#220-BB-010), or 5 ng/ml human recombinant TGF-β1 (R&D
Systems, #240-B-002). Total RNA was isolated using the RNeasy
isolation kit (Qiagen, #217004) and cDNA was prepared as
described above. Total TCF21 or allele-specific expression at
rs12190287 was measured as described above using the TaqMan
gene expression or TaqMan SNP genotyping probe with 
expression levels calculated using a standard curve and normalized 
to the gDNA for each allele. 

Determination of annealing rate constants for 
complementary RNA [18] 

Observed association rate constants (k_obs) were measured as
previously described in detail [49,50]. Briefly, 5' radioactively
labeled miR-224 or miR-224_SNP (0.5 nM final concentration)
was incubated with the TCF21 3'-UTR-C or TCF21 3'-UTR-G
target mRNA at 5 nM final concentration in hybridization buffer
(100 mM NaCl, 20 mM Tris-HCl, pH 7.4, and 10 mM MgCl2)
in the presence of 10 mM CTAB at 37°C. Aliquots were
withdrawn at different time points, transferred into 1 vol of stop
buffer (20 mM Tris-HCl, pH 7.4, 10 mM EDTA, 2% (v/v)
SDS, 8 M urea, 0.025% (v/v) bromophenol blue) and analyzed
by native polyacrylamide gel electrophoresis (0.1 × 11 × 12 cm,
run at 4°C and 150 V for 2 h). Gels were sealed in polyethylene and
exposed to X-ray film stored at -20°C until band intensities
were determined using a phosphorimager (Typhoon 8600
Variable Mode Imager, GE Healthcare). ImageQuant 5.2
software was used to quantify signals relative to whole lane
signal. Second order association rate constants were calculated as
described [20].
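
The exact calculation follows the cited references and is not reproduced here. As a rough illustration only, the Python sketch below assumes pseudo-first-order conditions (target mRNA at 5 nM in 10-fold excess over 0.5 nM labeled miRNA), a single-exponential time course f(t) = f_max (1 - exp(-k_obs t)), and a second-order constant approximated as k_obs divided by the target concentration. The function names, the fitting approach (linearized least squares through the origin) and the example data are all assumptions.

```python
import math


def estimate_k_obs(times_s, fraction_bound, f_max=1.0):
    """Estimate k_obs (1/s) from fraction-bound data by linearized least squares.

    Uses -ln(1 - f/f_max) = k_obs * t and fits a line through the origin.
    """
    num = 0.0
    den = 0.0
    for t, f in zip(times_s, fraction_bound):
        if not 0.0 <= f < f_max:
            raise ValueError("fraction bound must lie in [0, f_max)")
        y = -math.log(1.0 - f / f_max)
        num += t * y
        den += t * t
    if den == 0.0:
        raise ValueError("need at least one time point with t > 0")
    return num / den


def second_order_rate_constant(k_obs, target_conc_molar):
    """Approximate k2 (1/(M*s)) under pseudo-first-order conditions."""
    return k_obs / target_conc_molar


# Hypothetical time course generated with k_obs = 0.01 1/s.
times = [30.0, 60.0, 120.0]
bound = [1.0 - math.exp(-0.01 * t) for t in times]
k_obs = estimate_k_obs(times, bound)
print(second_order_rate_constant(k_obs, 5e-9))  # target at 5 nM
```

With real gel data, f_max would usually be fitted as well, since complex formation rarely reaches 100%.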

Probing and primer extension [18] 

PCR products harboring the RNA polymerase T7 recognition 
site were amplified for TCF21 with 5'-GAA ATT AAT ACG ACT 
CAC TAT AGG GCC TTG GAG TTT GGT ACC TGG-3' as 
forward, 5'- TCA GGT CGA CTT GGT GGA ACA AAT CTT 
TTA TTT TC-3' as reverse primer and the pmirGLO TCF21 3'- 
UTR constructs as template and used for in vitro transcription (T7 
RiboMAX Express Large Scale RNA Production System, 
Promega, #P1320). In vitro transcripts (IVTs) were purified by 
phenol-chloroform-extraction, G-50 column filtration and ethanol 
precipitation, and subsequently denatured for 10 min at 70°C and 
refolded at room temperature for 120 min. RNase T1 based
hydrolysis: IVTs were incubated at RT for 4 min in 10 µl reaction
containing 1 µg tRNA (Sigma, #83853-25MG) and increasing
RNase T1 (0, 0.25, 1 and 2 units, Fermentas, #EN0541).
Cleavage products were purified as described above. Pb²⁺ based
probing: refolded IVTs were incubated for 15 min at RT in a
10 µl reaction mix. Reactions were initiated with 5 µg tRNA and
increasing amounts of lead(II) acetate (Pb²⁺) (Sigma, #32307),
terminated after 10 min at RT with EDTA/ethanol, followed by
ethanol precipitation. RT reactions with either RNase T1 or lead
hydrolysis products were performed for 45 min at 42°C using
1 mM dNTPs, 2.5 mM RT-primer (5'-[32P]-AGO GCA TCC
TGA CAT CTT GA-3') and 1.5 units AMV Reverse Transcriptase
(Promega, #M5108). Sequencing reactions were performed
in parallel with denatured IVT (2 min at 95°C) for each nucleotide
base, as adapted from [49]. After cDNA synthesis, samples were
denatured in formamide-containing loading buffer for 3 min at
95°C and resolved on a 10% polyacrylamide sequencing gel under
denaturing conditions for 70 min at 52°, and signals analyzed with
a PhosphorImager (Typhoon 8600 Variable Mode Imager, GE
Healthcare). 

miRNA annealing [18] 

Single-stranded miRNA guide and passenger strands (miR-224 
and miR-224_SNP, Fig. 2B; miR-224 guide: 5'-CAA GUC ACU 
AGU GGU UCC GUU-3', miR-224_SNP guide: 5'-CAA CUC 
ACU AGU GGU UCC GUU-3' and miR-224 passenger: 5'- 
AAA AUG GUG CCC UAG UGA CUA CA-3') were
synthesized by biomers.net GmbH. Double-stranded miRNA
was generated by incubating the two strands at a final
concentration of 20 µM in 1× RNA annealing buffer (6 mM
Tris-HCl pH 7.4, 20 mM KCl, 0.4 mM MgCl2). The annealing
reaction was performed by denaturing the oligonucleotides (3 min 
at 95°C) and subsequent slow cooling in a heat block. The 
hybridization product was analyzed by native PAGE. 
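
The relationship between the guide strands listed above and their seed-matched target sites can be checked with a short Python sketch. It assumes the common convention that the seed spans guide nucleotides 2-8 and that the target site is the reverse complement of the seed; the function names are hypothetical.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna):
    """Return the reverse complement of an RNA sequence (Watson-Crick pairs only)."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def seed_target_site(guide, start=2, end=8):
    """Return the mRNA site (5'->3') complementary to guide positions start..end."""
    seed = guide.replace(" ", "")[start - 1:end]
    return reverse_complement(seed)


MIR224_GUIDE = "CAA GUC ACU AGU GGU UCC GUU"
MIR224_SNP_GUIDE = "CAA CUC ACU AGU GGU UCC GUU"

print(seed_target_site(MIR224_GUIDE))      # GUGACUU
print(seed_target_site(MIR224_SNP_GUIDE))  # GUGAGUU
```

Under these assumptions the miR-224 guide pairs with GUGACUU, the seed-match sequence reported for the TCF21 3'-UTR, while the miR-224_SNP guide pairs with a site that differs at a single position (C to G).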

Computational analysis of RNA secondary structure [18] 

In silico folding of RNA sequences was performed using an 
adaptation of the mfold package [51,52] that has been modified to 
work with the Accelrys Genetics Computer Group. The calcula- 
tions were performed with the polymorphic sequence segments 
containing the SNP at varying internal positions and by defining 
stepwise (10-25 nt) moving segments with sizes of 100, 200, 400 
and 800 nt. The resulting structures were compared globally and 
locally at the SNP position and/or the respective miRNA-binding 
site and grouped according to the involvement of the SNP- 
containing sequence segment in intramolecular folding. We 
validated our predicted structures with the RNAfold package 
(University of Vienna) using minimum free energy (MFE) based 
structure calculations from varying length segments containing the 
SNP. 
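
The moving-segment scheme described above can be illustrated with a small Python sketch that enumerates windows of a given size containing the SNP, so that the SNP falls at varying internal positions. The function name, the coordinate convention (0-based, half-open) and the example coordinates are assumptions; the folding itself (mfold or RNAfold) is not shown.

```python
def windows_containing(seq_len, snp_pos, size, step):
    """List (start, end) windows of length `size` that contain `snp_pos`.

    Coordinates are 0-based and half-open; windows advance by `step` nt.
    """
    if size > seq_len:
        return []
    selected = []
    for start in range(0, seq_len - size + 1, step):
        end = start + size
        if start <= snp_pos < end:
            selected.append((start, end))
    return selected


# Hypothetical 1,000 nt transcript with the SNP at index 500.
for size in (100, 200, 400, 800):
    segments = windows_containing(seq_len=1000, snp_pos=500, size=size, step=25)
    print(size, len(segments))
```

Each selected segment would then be folded separately and the structures compared at the SNP and miRNA-binding site.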

In silico transcription factor and miRNA binding 
intersection and enrichment analysis 

Genome-wide binding regions for hg19 ENCODE transcription
factor ChIP V3 and miRcode V11 or TargetScan miRNA
binding sites were extracted using the Galaxy tool. Resulting bed
files were intersected with the latest GWAS SNP catalog (in
European populations) from the National Human Genome
Research Institute (NHGRI), augmented with SNPs in LD at
r² > 0.8, to identify overlapping positions. Overlapping genomic
regions of transcription factor binding and TargetScan miRNA
binding sites were imported into the Genomic Regions Enrichment
of Annotations Tool (GREAT) for functional assignment by



pathway and motif analyses. Statistical enrichments were 
performed for associations between the overlapping genomic 
regions and the annotations using the whole genome as a 
background region. 
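
The intersection step can be sketched in plain Python as below. This is not the Galaxy workflow used in the study; it is a minimal illustration that assumes BED-style (chromosome, start, end) tuples with half-open coordinates, and the function name and example intervals are hypothetical.

```python
def count_overlapping(query, reference):
    """Count query intervals that overlap at least one reference interval.

    Intervals are (chrom, start, end) tuples with half-open coordinates.
    """
    by_chrom = {}
    for chrom, start, end in reference:
        by_chrom.setdefault(chrom, []).append((start, end))
    for intervals in by_chrom.values():
        intervals.sort()

    hits = 0
    for chrom, start, end in query:
        for ref_start, ref_end in by_chrom.get(chrom, []):
            if ref_start >= end:
                break  # sorted by start, so no later interval can overlap
            if ref_end > start:
                hits += 1
                break
    return hits


# Hypothetical TF ChIP-seq peaks and predicted miRNA binding sites.
tf_peaks = [("chr6", 100, 200), ("chr6", 500, 600), ("chr1", 0, 50)]
mirna_sites = [("chr6", 150, 157), ("chr6", 300, 307),
               ("chr1", 50, 57), ("chr6", 595, 602)]
overlapping = count_overlapping(mirna_sites, tf_peaks)
print(overlapping, overlapping / len(mirna_sites))
```

For genome-scale inputs a sweep-line or interval-tree implementation (or a dedicated tool) would be preferable to this linear scan per query.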

Immunohistochemistry 

Major coronary arteries were dissected from explanted hearts 
of patients undergoing heart transplant at Stanford, as 
previously described [53]. Briefly, left anterior descending 
(LAD), circumflex, and right coronary arteries were dissected 
and macroscopically scored as disease (containing lesion) or
normal (lesion-free), rinsed in saline and fixed in 4% paraformaldehyde
overnight at 4°C, followed by cryopreservation in
10%, 20%, and 30% sucrose at 4°C for 30 min, 1 hr, and 2 hrs,
respectively. Coronary segments were embedded in OCT media
prior to sectioning at 7 µm thickness. Frozen slides were thawed
and immunohistochemistry procedure was performed according
to the manufacturer's protocol (Biocare Medical, #RMR625).
Briefly, tissue sections were blocked for 30 min using a universal
blocking reagent and endogenous peroxidases were
quenched prior to incubation with rabbit anti-TCF21
(Abcam, #ab49475), mouse anti-ACTA2 (α-SMA; Sigma,
#SAB1403519) primary antibodies or rabbit serum as a
negative control (purified rabbit or mouse IgG were also used
as negative control antibodies). Sections were washed in Tris
buffered saline (TBS) and incubated in respective alkaline
phosphatase (AP) conjugated polymers for 30 min followed by
detection using Vulcan Fast Red chromogen (Biocare Medical,
#FR805). Nuclei were counterstained using Methyl Green
(Vector Labs, #H3402). Images were captured on a Zeiss light 
microscope and total brightness and contrast were uniformly 
adjusted for each condition. 

In situ hybridization 

Unlabeled miR-224 locked nucleic acid (LNA) and scrambled 
LNA control oligo probes were purchased from Exiqon and 
100 pmol oligos were labeled using the digoxigenin (DIG) 
Oligonucleotide Tailing Kit, 2nd generation (Roche, #3-353-583)
according to the manufacturer's instructions. Labeled probes
were purified using Sephadex G25 columns (GE Biosciences, 
#27-5325-01) according to the manufacturer's instructions and 
labeling efficiency was measured via dot blot analysis using serial 
dilutions of labeled LNA oligo and Control DIG-dUTP/dATP 
tailed oligo with detection using an anti-DIG-AP conjugated 
antibody (Roche, #1093274) and NBT/BCIP developer (Roche, 
#1 1697471001). Probes were diluted in hybridization buffer to a 
final concentration of 25 or 50 nM and linearized for 5 min at 
65°C. Probes were added to thawed slides and incubated at 55°C 
in a humidified chamber for 2 hrs. Slides were washed with 5X, 
1X, and 0.2X SSC buffer for 15, 30, and 15 min, respectively, followed by 
a 15 min wash in phosphate buffered saline (PBS). Slides were 
incubated in blocking solution containing 5% heat-inactivated 
sheep serum, 1% bovine serum albumin, and 0.1% Tween-20 in 
RNase-free PBS. Slides were then incubated with AP-conjugated 
anti-DIG Fab fragment antibody (1:1500, Roche, #1093274) for 
2.5 hrs at RT. Slides were washed for 2x30 min in PBS-Tween 
0.1% and 2x20 min in PBS. Signal was detected by incubating 
with NBT/BCIP developer with 1 mM Levamisole (Sigma) for 
36-48 hr at RT in the dark. Nuclei were counterstained with 
Nuclear Fast Red (Vector) for 5 min, washed in running H2O 
and slides coverslipped with Aqua-Poly/Mount (Polysciences). 
Images were obtained at 20× magnification using a light 
microscope. 
Microarray and TaqMan based gene expression in human 
atherosclerotic carotid arteries 

Human atherosclerotic carotid artery lesions were obtained 
from patients undergoing endarterectomy surgery for stable 
(asymptomatic) (n = 40) or unstable (symptomatic) (n = 87) carotid 
stenosis, as part of the Biobank of Karolinska Endarterectomies 
(BiKE). Normal control arterial samples (n = 10) were obtained 
from the iliac and radial arteries from healthy organ donors 
without any history of cardiovascular disease. Briefly, tissue was 
snap frozen in liquid nitrogen before pulverizing to a fine powder 
using a pre-chilled mortar and pestle, then resuspended in QIAzol 
lysis reagent (Qiagen) and homogenized with a rotor-stator tissue 
homogenizer. Total RNA was extracted as described above using 
the miRNeasy Mini Kit (Qiagen) and RNA quality assessed using 
a Bioanalyzer 2100 (Agilent). Global gene expression profiles were 
analyzed by Affymetrix HG-U133 Plus 2.0 GeneChip microarrays 
from 127 patient derived plaque samples and 10 donor control 
samples. Robust multi-array average (RMA) normalization was 
performed and processed gene expression data presented in Log2 
scale. For TaqMan based analysis, miRNA-specific cDNA was 
prepared as described above, and TaqMan qPCR was performed 
in triplicate using predesigned TaqMan probes for miR-224 and 
normalized to the RNU44 internal control. Data are represented 
as mean Log2 fold change of replicates from two independent 
experiments. 
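
Normalization of a target miRNA to an internal control is commonly computed with the comparative Ct (ΔΔCt) approach. The text above does not state the exact calculation used, so the following is only a minimal sketch under that assumption. The function names, the Ct values, and the assumption of roughly 100% amplification efficiency are hypothetical and serve illustration only.

```python
from statistics import mean


def delta_ct(target_ct, reference_ct):
    """Mean target Ct minus mean reference Ct across technical replicates."""
    return mean(target_ct) - mean(reference_ct)


def log2_fold_change(sample_dct, control_dct):
    """Log2 fold change under the 2^-ddCt model (assumes ~100% PCR efficiency)."""
    return -(sample_dct - control_dct)


# Hypothetical triplicate Ct values; these are NOT data from the study.
plaque_dct = delta_ct(
    target_ct=[27.0, 27.5, 26.5],     # miR-224 in a plaque sample
    reference_ct=[22.0, 22.5, 21.5],  # RNU44 in the same sample
)
control_dct = delta_ct(
    target_ct=[29.0, 29.5, 28.5],     # miR-224 in a control artery
    reference_ct=[22.0, 22.0, 22.0],  # RNU44 in the same sample
)

lfc = log2_fold_change(plaque_dct, control_dct)
print(f"log2 fold change (plaque vs control): {lfc:.2f}")
```

Working on the Log2 scale keeps up- and down-regulation symmetric around zero, which matches how the microarray data are reported.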

Statistical analysis 

Experiments were performed using at least three independent 
preparations with individual treatments/conditions performed in 
triplicate [10]. Data are presented as mean ± standard error of the mean 
(SEM) of replicates. GraphPad Prism 6.0 was used for statistical 
analysis. For all in vitro comparisons between two groups, a paired 
two-tailed t-test was performed. For carotid artery expression 
analyses between normal donor and endarterectomy plaque 
samples, an unpaired two-tailed t-test with Welch's correction was 
performed. P values < 0.05 were considered statistically significant. 
For multiple comparison testing, two-way analysis of variance 
(ANOVA) accompanied by Tukey's post-hoc test was used as 
appropriate. 
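
As an illustration of the summary statistics and tests named above, the following minimal sketch computes the SEM, a paired t statistic, and Welch's t statistic with its Welch-Satterthwaite degrees of freedom, using only the Python standard library. The analyses in this study were run in GraphPad Prism; the function names and input values below are hypothetical, and converting a t statistic to a P value would additionally require the t distribution.

```python
from math import sqrt
from statistics import mean, variance


def sem(values):
    """Standard error of the mean: sample SD divided by sqrt(n)."""
    return sqrt(variance(values) / len(values))


def paired_t(a, b):
    """Paired t statistic and degrees of freedom for matched observations."""
    diffs = [x - y for x, y in zip(a, b)]
    return mean(diffs) / sem(diffs), len(diffs) - 1


def welch_t(a, b):
    """Welch's unequal-variance t statistic and Welch-Satterthwaite df."""
    va = variance(a) / len(a)
    vb = variance(b) / len(b)
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Hypothetical values for illustration only; NOT data from the study.
group_a = [1.0, 2.0, 3.0, 4.0, 5.0]
group_b = [2.0, 4.0, 6.0, 8.0, 10.0, 12.0]
t_welch, df_welch = welch_t(group_a, group_b)

before = [1.0, 2.0, 3.0]
after = [2.0, 4.0, 3.5]
t_paired, df_paired = paired_t(before, after)

print(f"SEM of group_a: {sem(group_a):.4f}")
print(f"Welch t = {t_welch:.3f}, df = {df_welch:.2f}")
print(f"Paired t = {t_paired:.3f}, df = {df_paired}")
```

Welch's correction does not pool the variances, which suits comparisons such as 10 donor controls against 127 plaque samples, where group sizes and variances may differ substantially.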

Ethics statement 

All samples reported in this study were obtained with approval 
of the Institutional Review Board at Stanford University and 
under written informed consent from patients undergoing 
orthotopic heart transplantation (coronary arteries from explanted 
hearts), or those participating in the Genetic Determinants of 
Peripheral Artery Disease (GENEPAD) and Genetics of Insulin 
Sensitivity iPSC (GENESiPS) studies (peripheral blood). All 
atherosclerotic carotid plaque and donor control samples collected 
from the Biobank of Karolinska Endarterectomies (BiKE) were 
obtained with informed consent from patients, organ donors or 
their guardians. The BiKE study is approved by the Ethical 
Committee of Northern Stockholm. 
Figure S1 Predicted minimal free energy based RNA structure 
of major and minor alleles of TCF21 3'-UTR using the RNAfold 
algorithm. Arrow and circle denote location of rs12190287. Grey 